This brief notebook is based on the article by D. Higham, An Algorithmic Introduction to Numerical Simulation of Stochastic Differential Equations, SIAM Review 43:525-546 (2001). Higham provides Matlab codes illustrating the basic ideas at http://personal.strath.ac.uk/d.j.higham/algfiles.html; this notebook is a Python reimplementation.
In [1]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
In [2]:
T = 1.0 # End time
N = 500 # Number of steps
dt = T / N
t = np.linspace(0, T, N)
W = np.zeros(N)
dW = np.zeros(N)
dW[0] = np.sqrt(dt) * np.random.randn() # Each increment is normal with variance dt
W[0] = dW[0]
# This defines the Wiener process, or Brownian motion
for j in range(1, N):
    dW[j] = np.sqrt(dt) * np.random.randn()
    W[j] = W[j-1] + dW[j]
plt.plot(t, W)
plt.xlabel(r'$t$')
plt.ylabel(r'$W(t)$')
plt.show()
Of course, that method of constructing the process is notably inefficient because of the loop, so we vectorize it next.
In [3]:
T = 1.0 # End time
N = 500 # Number of steps
dt = T / N
t = np.linspace(0, T, N)
dW = np.sqrt(dt) * np.random.randn(N) # All increments are independent, so initialize the lot in one go
W = np.cumsum(dW) # The Brownian motion is the cumulative sum of all increments
plt.plot(t, W)
plt.xlabel(r'$t$')
plt.ylabel(r'$W(t)$')
plt.show()
A single path by itself is not very useful. Instead we should look at an ensemble of paths, and of functions of paths.
The function we choose is $U = \exp \left[ t + \tfrac{W}{2} \right]$ where $W$ is the Brownian path.
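Since $W(t)$ is normally distributed with mean zero and variance $t$, the expected value of $U$ follows from the standard Gaussian identity $E[e^{aW(t)}] = e^{a^2 t / 2}$:
\begin{equation} E[U(t)] = e^{t} \, E\left[ e^{W(t)/2} \right] = e^{t} e^{t/8} = e^{9t/8}, \end{equation}
which is the reference curve used in the error check below.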
In [4]:
T = 1.0 # End time
N = 500 # Number of steps
dt = T / N
t = np.linspace(0, T, N)
M = 1000 # Will construct M paths simultaneously
dW = np.sqrt(dt) * np.random.randn(M, N) # Now construct the increments for all points and paths
W = np.cumsum(dW, 1) # Take the cumulative sum for each path separately
U = np.exp(t + 0.5*W) # Construct a function of the Brownian path
Umean = np.mean(U, 0) # Take the mean across all paths
# Compare against the expected average solution
print("Average error is {}".format(np.linalg.norm((Umean - np.exp(9*t/8)), np.inf)))
plt.plot(t, Umean, 'b-')
for i in range(5):
    plt.plot(t, U[i, :], 'r--')
plt.xlabel(r'$t$')
plt.ylabel(r'$U(t)$')
plt.legend(('Mean of all paths', 'Some individual paths'), loc = 'upper left')
plt.show()
So we can define and compute the random Brownian process $W$ and construct functions of it, averaging over the various realisations, or paths, to estimate expected values.
Next we want to compute the integral of a function of a random process, $\int_0^T h(t) dW(t)$. There are two standard approaches, the Ito and the Stratonovich integrals, and they give different answers.
We compare the two approaches below, applied to the simple case $\int_0^T W(t) \, dW(t)$.
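For this particular integrand both integrals are known in closed form, which is what the error checks below compare against:
\begin{equation} \int_0^T W \, dW = \tfrac{1}{2} \left( W(T)^2 - T \right) \quad \text{(Ito)}, \qquad \int_0^T W \circ dW = \tfrac{1}{2} W(T)^2 \quad \text{(Stratonovich)}. \end{equation}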
In [5]:
T = 1.0
N = 500
dt = T / N
dW = np.sqrt(dt) * np.random.randn(N)
W = np.cumsum(dW)
shiftedW = np.zeros(N)
shiftedW[1:] = W[:-1]
# The two different integrals.
# The Ito integral is roughly the Riemann integral evaluated at the left edge of the subinterval
ito = np.sum(shiftedW*dW)
# The Stratonovich integral is roughly the Riemann integral evaluated at the centre of the subinterval
stratonovich = np.sum((0.5*(shiftedW+W) + 0.5*np.sqrt(dt)*np.random.randn(N))*dW)
# Note that the exact solutions are different - markedly so!
print("The Ito integral is {} with error {}".format(ito, np.abs(ito - 0.5*(W[-1]**2-T))))
print("The Stratonovich integral is {} with error {}".format(stratonovich, np.abs(stratonovich - 0.5*W[-1]**2)))
We now look at solving stochastic differential equations. We will write these in the form
\begin{equation} dX(t) = f(X(t)) dt + g(X(t)) dW(t). \end{equation}The initial conditions are $X(0) = X_0$, and we solve for $0 \le t \le T$.
The simplest extension of the standard Euler method for solving initial value problems for an ODE is the Euler-Maruyama method. Here we apply it to the simple linear SDE \begin{equation} dX(t) = \lambda X(t) \, dt + \mu X(t) \, dW(t). \end{equation} For definiteness we choose $\lambda = 2, \mu = 1, X_0 = 1, T = 1$.
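For reference, the Euler-Maruyama update over a step of size $\Delta t$, with Brownian increment $\Delta W_j = W(\tau_j) - W(\tau_{j-1})$, is
\begin{equation} X_j = X_{j-1} + f(X_{j-1}) \Delta t + g(X_{j-1}) \Delta W_j, \end{equation}
and this linear SDE has the exact solution $X(t) = X_0 \exp \left[ \left( \lambda - \tfrac{1}{2} \mu^2 \right) t + \mu W(t) \right]$, which is used as the reference solution in the code below.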
In [7]:
lam = 2.0
mu = 1.0
X0 = 1.0
T = 1.0
N = 2**8
dt = T / N
t = np.linspace(0, T, N)
dW = np.sqrt(dt) * np.random.randn(N)
W = np.cumsum(dW)
# The "true" (average, or expected) solution
Xtrue = X0 * np.exp((lam - 0.5*mu**2)*t + mu*W)
# Take coarser steps than the Brownian path: each Euler-Maruyama step
# uses the sum of R consecutive Brownian increments.
R = 4
Dt = R * dt
L = int(N / R)
tt = np.linspace(0, T, L + 1)
# Initial data
Xem = np.zeros(L + 1)
Xem[0] = X0
# Euler-Maruyama applied to this SDE
Xtemp = X0
for j in range(L):
    Winc = np.sum(dW[R*j:R*(j+1)])
    Xtemp += (Dt * lam + mu * Winc) * Xtemp
    Xem[j+1] = Xtemp
print("Error at the endpoint is {}".format(np.abs(Xem[-1] - Xtrue[-1])))
plt.plot(t, Xtrue, 'm-')
plt.plot(tt, Xem, 'r--*')
plt.xlabel(r'$t$')
plt.ylabel(r'$X$')
plt.legend(('Expected solution', 'Realized solution'), loc = 'upper left')
plt.show()
As ever with a numerical method we need to check its convergence behaviour. What does this mean for a stochastic DE?
There are two types of convergence measure for a random variable: strong and weak. For strong convergence we expect the random variable (or function) to converge, on average, pointwise: \begin{equation} \lim_{\Delta t \to 0} E| X(t_n) - X_n| = 0. \end{equation} The convergence rate, $\gamma$, is defined to be the largest exponent such that, for some constant $C$, \begin{equation} E| X(t_n) - X_n | \le C \Delta t^{\gamma} \end{equation} when the timestep $\Delta t$ is sufficiently small. Measuring this at a single point, such as the endpoint $t = T, n = N$, gives us a single number to check.
Weak convergence only requires the error in the averages to converge: \begin{equation} \lim_{\Delta t \to 0} | E(X(t_n)) - E(X_n) | = 0. \end{equation} This is measured in a similar fashion using the convergence rate \begin{equation} | E(X(t_n)) - E(X_n) | \le C \Delta t^{\gamma}. \end{equation}
We test this on the linear SDE above; for Euler-Maruyama the expected strong order is $\gamma = \tfrac{1}{2}$ and the expected weak order is $\gamma = 1$.
In [9]:
lam = 2.0
mu = 1.0
X0 = 1.0
T = 1.0
N = 2**9 # Number of steps at finest resolution
dt = T / N
M = 1000 # Number of paths to sample
X_strong_error = np.zeros((M, 5))
Dtvals = np.zeros(5)
for s in range(M):
    dW = np.sqrt(dt) * np.random.randn(N)
    W = np.cumsum(dW)
    Xtrue = X0 * np.exp((lam - 0.5*mu**2)*T + mu*W[-1])
    for p in range(5):
        R = 2**p
        Dt = R * dt
        Dtvals[p] = Dt
        L = int(N / R)
        Xtemp = X0
        for j in range(L):
            Winc = np.sum(dW[R*j:R*(j+1)])
            Xtemp += (Dt*lam + mu*Winc)*Xtemp
        X_strong_error[s,p] = np.abs(Xtemp - Xtrue)
strong_error = np.mean(X_strong_error, 0)
# Measure the convergence rate
p = np.polyfit(np.log(Dtvals), np.log(strong_error), 1)
print("Measured convergence rate is {}".format(p[0]))
plt.loglog(Dtvals, strong_error, 'b*-')
plt.loglog(Dtvals, np.exp(p[1])*np.sqrt(Dtvals), 'r--')
plt.xlabel(r'$\Delta t$')
plt.ylabel(r'Sample average of $|X(t) - X_L|$')
plt.legend((r'Measured errors', r'Expected convergence rate $\gamma = \frac{1}{2}$'), loc = 'upper left')
plt.show()
For the weak convergence test we have to sample many more paths to get something sensible, because the weak error is easily swamped by the Monte Carlo sampling error, which decays only like $M^{-1/2}$.
In [10]:
lam = 2.0
mu = 1.0
X0 = 1.0
T = 1.0
N = 2**9 # Number of steps at finest resolution
dt = T / N
M = 100000 # Number of paths to sample
X_weak = np.zeros(5)
Dtvals = np.zeros(5)
for p in range(5):
    R = 2**p
    Dt = R * dt
    Dtvals[p] = Dt
    L = int(N / R)
    Xtemp = X0 * np.ones(M)
    for j in range(L):
        # For weak convergence the increments need not come from a common refined path
        Winc = np.sqrt(Dt)*np.random.randn(M)
        Xtemp += (Dt*lam + mu*Winc)*Xtemp
    X_weak[p] = np.mean(Xtemp)
weak_error = np.abs(X_weak - np.exp(lam*T)) # Exact expected value is E[X(T)] = X0*exp(lam*T)
# Measure the convergence rate
p = np.polyfit(np.log(Dtvals), np.log(weak_error), 1)
print("Measured convergence rate is {}".format(p[0]))
plt.loglog(Dtvals, weak_error, 'b*-')
plt.loglog(Dtvals, np.exp(p[1])*Dtvals, 'r--')
plt.xlabel(r'$\Delta t$')
plt.ylabel(r'$|E(X(t)) -$ Sample average of $X_L|$')
plt.legend((r'Measured errors', r'Expected convergence rate $\gamma = 1$'), loc = 'upper left')
plt.show()
As the results above show, the convergence rates of the Euler-Maruyama method are not wonderful. Higher order methods for SDEs can be constructed using Ito's Lemma - the stochastic version of Taylor's theorem - in the same way that, for example, standard Runge-Kutta methods are constructed.
Here we follow Higham by applying Milstein's method to the SDE \begin{equation} dX = r X (k - X) dt + \beta X dW(t) \end{equation} with initial data $X(0) = X_0 = 0.5$ and parameter values $r = 2, k = 1, \beta = 1/4$. Checking the strong convergence rate shows the improvements with this method.
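For the general SDE above, Milstein's method adds one further term from the Ito-Taylor expansion to the Euler-Maruyama update:
\begin{equation} X_j = X_{j-1} + f(X_{j-1}) \Delta t + g(X_{j-1}) \Delta W_j + \tfrac{1}{2} g(X_{j-1}) g'(X_{j-1}) \left( \Delta W_j^2 - \Delta t \right). \end{equation}
With $g(X) = \beta X$ the correction term is $\tfrac{1}{2} \beta^2 X_{j-1} \left( \Delta W_j^2 - \Delta t \right)$, which is exactly what appears in the code below.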
In [12]:
r = 2.0
K = 1.0
beta = 0.25
X0 = 0.5
T = 1.0
N = 2**11
dt = T / N
M = 500
dW = np.sqrt(dt) * np.random.randn(M, N)
X_mil = np.zeros((M, 5))
Dtvals = np.zeros(5)
# We will do the comparison against a reference solution computed with the finest step, R = 1
R = [1, 16, 32, 64, 128]
for p in range(5):
    Dt = R[p]*dt
    Dtvals[p] = Dt
    L = int(N / R[p])
    Xtemp = X0 * np.ones(M)
    for j in range(L):
        Winc = np.sum(dW[:, R[p]*j:R[p]*(j+1)], 1)
        Xtemp += Dt*r*Xtemp*(K-Xtemp) + beta*Xtemp*Winc + 0.5*beta**2*Xtemp*(Winc**2 - Dt)
    X_mil[:, p] = Xtemp
X_ref = X_mil[:, 0]
strong_error = np.zeros(4)
for i in range(1, 5):
    strong_error[i-1] = np.mean(np.abs(X_mil[:, i] - X_ref))
# Measure the convergence rate
p = np.polyfit(np.log(Dtvals[1:]), np.log(strong_error), 1)
print("Measured convergence rate is {}".format(p[0]))
plt.loglog(Dtvals[1:], strong_error, 'b*-')
plt.loglog(Dtvals[1:], np.exp(p[1])*Dtvals[1:], 'r--')
plt.xlabel(r'$\Delta t$')
plt.ylabel(r'Sample average of $|X(t) - X_L|$')
plt.legend((r'Measured errors', r'Expected convergence rate $\gamma = 1$'), loc = 'upper left')
plt.show()
Finally, we look at the linear stability of the numerical methods. In particular we focus on the Euler-Maruyama method applied to the simple SDE \begin{equation} dX = \lambda X dt + \mu X dW(t) \end{equation} with initial data $X(0) = X_0 = 1$ and parameters $\lambda = -3, \mu = \sqrt{3}$.
The SDE (and the numerical method) is linearly stable if $\lim_{t \to \infty} E[X(t)^2] = 0$ (mean-square stability) or if $\lim_{t \to \infty} |X(t)| = 0$ with probability 1 (asymptotic stability); whether the numerical method inherits the stability of the SDE depends on the timestep taken.
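For this linear test SDE the stability conditions can be written down explicitly (they are given in Higham's paper): the SDE itself is mean-square stable when $2\lambda + \mu^2 < 0$, which holds for our parameters, while the Euler-Maruyama iterates satisfy $E[X_j^2] \to 0$ only when
\begin{equation} |1 + \Delta t \lambda|^2 + \Delta t \mu^2 < 1. \end{equation}
A quick standalone check of this condition for the three timesteps used below (a minimal sketch, not one of the simulation cells):
lam, mu = -3.0, np.sqrt(3.0)
print("SDE mean-square stable:", 2*lam + mu**2 < 0)
for Dt in (0.25, 0.5, 1.0):
    # Euler-Maruyama mean-square stability condition
    print("Dt = {}: EM mean-square stable = {}".format(Dt, (1 + Dt*lam)**2 + Dt*mu**2 < 1))
Only $\Delta t = \tfrac{1}{4}$ satisfies the condition, so only that curve should decay in the mean-square plot.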
In [14]:
# This is the mean square case
T = 20.0
M = 50000
X0 = 1.0
lam = -3.0
mu = np.sqrt(3.0)
dt = 0.25
N = 80
plt.subplot(2,1,1)
for k in range(3):
    Dt = 2**k * dt
    L = int(N / 2**k)
    Xms = np.zeros(L)
    Xtemp = X0 * np.ones(M)
    t = np.linspace(0, T, L)
    for j in range(L):
        Winc = np.sqrt(Dt) * np.random.randn(M)
        Xtemp += (Dt*lam + mu*Winc)*Xtemp
        Xms[j] = np.mean(Xtemp**2)
    plt.semilogy(t, Xms)
plt.legend((r'$\Delta t = \frac{1}{4}$', r'$\Delta t = \frac{1}{2}$', r'$\Delta t = 1$'), loc = 'lower left')
plt.title(r'Mean-Square: $\lambda = -3, \mu = \sqrt{3}$')
plt.ylabel(r'$E[X^2]$')
# This is a single asymptotic path, run over a much longer time interval
T = 500.0
N = int(T / dt)
plt.subplot(2,1,2)
for k in range(3):
    Dt = 2**k * dt
    L = int(N / 2**k)
    Xemabs = np.zeros(L)
    Xtemp = X0
    t = np.linspace(0, T, L)
    for j in range(L):
        Winc = np.sqrt(Dt) * np.random.randn()
        Xtemp += (Dt*lam + mu*Winc)*Xtemp
        Xemabs[j] = np.abs(Xtemp)
    plt.semilogy(t, Xemabs)
plt.legend((r'$\Delta t = \frac{1}{4}$', r'$\Delta t = \frac{1}{2}$', r'$\Delta t = 1$'), loc = 'lower left')
plt.title(r'Single Path: $\lambda = -3, \mu = \sqrt{3}$')
plt.ylabel(r'$|X|$')
plt.show()